
We claim: 

1. A computer-implemented method comprising:

allocating each of a plurality of items to at least one of a plurality of clusters, based 
on a predetermined criterion accounting for at least a quota for each item; 
selecting an item for a current cluster from items allocated to the current cluster; and,
effecting the item. 

2. The method of claim 1, wherein the plurality of items comprises a plurality of ads, 
and effecting the item comprises displaying the ad. 

3. The method of claim 2, wherein the predetermined criterion further accounts for a 
constraint for each cluster.

4. The method of claim 2, wherein the predetermined criterion further accounts for a 
particular one of the plurality of ads restricted from being shown in a particular one or 
more of the plurality of clusters. 

5. The method of claim 2, wherein the predetermined criterion comprises maximizing an
expression $\sum_{i,j} p_{ij} x_{ij}$, where $p_{ij}$ comprises a probability that a user in
cluster j will actuate ad i.

6. The method of claim 5, wherein the predetermined criterion further comprises
maximizing the expression subject to a constraint $\sum_{j} x_{ij} = q_i$, where $q_i$
comprises a quota for ad i, and $x_{ij}$ comprises a total number of times ad i is shown
in cluster j.

7. The method of claim 5, wherein the predetermined criterion further comprises
maximizing the expression subject to a constraint $\sum_{i} x_{ij} = c_j$, where $c_j$
comprises a constraint for cluster j, and $x_{ij}$ comprises a total number of times ad i is
shown in cluster j.

8. The method of claim 5, wherein the predetermined criterion comprises maximizing
the expression subject to a first constraint $\sum_{j} x_{ij} = q_i$, where $q_i$ comprises a
quota for ad i, and $x_{ij}$ comprises a total number of times ad i is shown in cluster j, and
a second constraint $\sum_{i} x_{ij} = c_j$, where $c_j$ comprises a constraint for cluster j, and
$x_{ij}$ comprises a total number of times ad i is shown in cluster j, such that the
expression, the first constraint and the second constraint define a linear program.

9. The method of claim 8, wherein the linear program is solved by the Simplex 
Algorithm. 
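
Taken together, the objective of claim 5 and the two constraints of claim 8 can be written as a single linear program. The display below is a sketch of that reading, not claim language; the non-negativity condition on $x_{ij}$ is an assumption added here because a count of showings cannot be negative.

```latex
\begin{aligned}
\text{maximize}\quad & \sum_{i,j} p_{ij}\, x_{ij} \\
\text{subject to}\quad & \sum_{j} x_{ij} = q_i \quad \text{for each ad } i, \\
& \sum_{i} x_{ij} = c_j \quad \text{for each cluster } j, \\
& x_{ij} \ge 0 \quad \text{(assumed)}.
\end{aligned}
```

With both constraints held as equalities, a feasible allocation can exist only when $\sum_i q_i = \sum_j c_j$, since each side counts the same total number of showings.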

10. The method of claim 2, wherein allocating each of a plurality of ads to at least one of
the plurality of clusters comprises determining for each ad in each cluster a probability
that a user in the cluster will actuate the ad.

11. The method of claim 10, wherein the probability that a user in the cluster will actuate
the ad comprises the probability that a user in the cluster will click on the ad.

12. The method of claim 10, wherein determining for each ad in each cluster a probability
that a user in the cluster will actuate the ad comprises inputting training data from which
to determine for each ad in each cluster the probability that a user in the cluster will
actuate the ad.

13. The method of claim 10, wherein determining for each ad in each cluster a probability
that a user in the cluster will actuate the ad comprises utilizing at least one of: a
maximum likelihood approach, a MAP method approach, and a hierarchical Bayesian
approach.
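
As an illustration of the first two approaches named in claim 13, the sketch below estimates the actuation probability for one ad in one cluster from click counts. The function names, the Beta prior, and its default parameters are assumptions chosen for this example; the claims do not specify them.

```python
def ml_estimate(clicks: int, impressions: int) -> float:
    """Maximum-likelihood estimate: the observed fraction of showings that were clicked."""
    if impressions == 0:
        raise ValueError("no impressions observed for this ad/cluster pair")
    return clicks / impressions


def map_estimate(clicks: int, impressions: int,
                 alpha: float = 2.0, beta: float = 2.0) -> float:
    """MAP estimate under an assumed Beta(alpha, beta) prior.

    The posterior is Beta(alpha + clicks, beta + impressions - clicks);
    this returns its mode, which is well defined when both posterior
    parameters exceed 1 (true for the default prior).
    """
    return (clicks + alpha - 1) / (impressions + alpha + beta - 2)
```

The MAP estimate remains defined with no observations (it falls back to the prior mode), which is one reason a prior can help for ad and cluster pairs with little training data.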

14. The method of claim 2, wherein the predetermined criterion comprises maximizing an 
expected number of actuations of the plurality of ads, given the quota for each ad and the 
constraint for each cluster. 
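
The criterion of claim 14 can be illustrated by scoring candidate allocations. The sketch below computes the expected number of actuations for an allocation and checks it against the quotas and cluster constraints; all names and numbers are hypothetical, and equality constraints are assumed as in claims 6 and 7.

```python
def expected_actuations(p, x):
    """Sum of p[i][j] * x[i][j] over all ads i and clusters j."""
    return sum(p_ij * x_ij
               for p_row, x_row in zip(p, x)
               for p_ij, x_ij in zip(p_row, x_row))


def satisfies_constraints(x, quotas, capacities):
    """True if each ad's row sums to its quota and each cluster's column to its constraint."""
    rows_ok = all(sum(row) == q for row, q in zip(x, quotas))
    cols_ok = all(sum(col) == c for col, c in zip(zip(*x), capacities))
    return rows_ok and cols_ok


# Hypothetical data: two ads, two clusters.
p = [[0.10, 0.02],        # ad 0: actuation probability in cluster 0 and cluster 1
     [0.03, 0.05]]        # ad 1
quotas = [100, 200]       # q_i: showings required for each ad
capacities = [150, 150]   # c_j: visits available in each cluster

x_a = [[100, 0], [50, 150]]   # candidate allocation A
x_b = [[0, 100], [150, 50]]   # candidate allocation B
```

Both candidates satisfy the constraints, but allocation A places each ad where its actuation probability is higher and therefore scores better under the criterion.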

15. The method of claim 2, wherein the constraint for each cluster comprises a total 
number of times the cluster is visited by any user.

16. The method of claim 2, wherein the quota for each ad comprises a total number of 
times that the ad must be displayed.

17. The method of claim 2, wherein the criterion comprises favoring at least one ad over
other ads within the plurality of ads in allocating the at least one ad.

18. The method of claim 2, wherein the criterion comprises accounting for at least one 
house ad. 

19. The method of claim 2, wherein the predetermined criterion comprises minimizing an
expression $\sum_{i,j} p_{ij} x_{ij}$, where $p_{ij}$ comprises a probability that a user in
cluster j will actuate ad i.

20. The method of claim 2, wherein the predetermined criterion comprises maximizing an
expression $\sum_{i,j} a_i p_{ij} x_{ij}$, where $p_{ij}$ comprises a probability that a user in
cluster j will actuate ad i, and $a_i$ comprises a coefficient for the ad i to indicate
weighting of the ad i.

21. The method of claim 5, wherein the predetermined criterion further comprises
maximizing the expression subject to a constraint $x_{ij} = 0$ for a particular ad i within a
particular cluster j, where $x_{ij}$ comprises a total number of times the ad i is shown in the
cluster j.

22. The method of claim 5, wherein the predetermined criterion further comprises
maximizing the expression subject to a constraint $\sum_{i} x_{ij} < c_j$, where $c_j$
comprises a constraint for cluster j, and $x_{ij}$ comprises a total number of times ad i is
shown in cluster j.

23. The method of claim 10, wherein the probability that a user in the cluster will actuate
the ad comprises the probability that a user in the cluster will make a purchase based on
the ad.

24. The method of claim 2, wherein the method includes first initially defining the 
plurality of clusters.

25. The method of claim 24, wherein defining the plurality of clusters comprises utilizing 
user information obtained without monitoring. 

26. The method of claim 24, wherein utilizing user information obtained without
monitoring comprises utilizing a category tag (e.g., page group) of the page on which the
item is to be displayed.

27. The method of claim 25, wherein utilizing user information obtained without 
monitoring comprises utilizing user information obtained from the user via a 
questionnaire. 

28. The method of claim 24, wherein defining the plurality of clusters comprises utilizing 
a preexisting plurality of groups as the plurality of clusters.

29. The method of claim 24, wherein defining the plurality of clusters comprises utilizing 
a Bayesian network.

30. The method of claim 24, wherein defining the plurality of clusters comprises utilizing
a naive-Bayes-network clustering approach.

31. The method of claim 30, wherein utilizing a Bayesian network clustering approach 
comprises utilizing a bottleneck architecture. 

32. The method of claim 30, wherein utilizing a Bayesian network clustering approach
comprises utilizing a bottleneck architecture recursively to construct a hierarchy of
clusters.

33. The method of claim 30, wherein utilizing a Bayesian network clustering approach 
comprises training a Bayesian network using a stochastic gradient descent technique. 

34. The method of claim 30, wherein utilizing a Bayesian network clustering approach
comprises employing a single hidden variable having a plurality of values. 

35. The method of claim 30, wherein utilizing a Bayesian network clustering approach 
comprises employing a plurality of hidden variables, each having two values. 

36. A computer-implemented method comprising: 

defining a plurality of clusters, each cluster corresponding to a group of users who are
most receptive to a given type of ad; and,

allocating an ad having a particular type to at least one cluster based on the particular 
type of the ad and based on a predetermined criterion.

37. The method of claim 36, wherein defining the plurality of clusters comprises utilizing
user information obtained without monitoring.

38. The method of claim 37, wherein utilizing user information obtained without 
monitoring comprises utilizing user information obtained from the user via a 
questionnaire. 

39. The method of claim 36, wherein defining the plurality of clusters comprises utilizing
a Bayesian network.

40. The method of claim 36, wherein defining the plurality of clusters comprises utilizing 
a naive-Bayes-network clustering approach. 

41. The method of claim 40, wherein utilizing a Bayesian network clustering approach 
comprises utilizing a bottleneck architecture. 

42. The method of claim 40, wherein utilizing a Bayesian network clustering approach 
comprises utilizing a hierarchical bottleneck architecture. 

43. The method of claim 40, wherein utilizing a Bayesian network clustering approach 
comprises training a Bayesian network using a stochastic gradient descent technique. 

44. The method of claim 40, wherein utilizing a Bayesian network clustering approach
comprises employing a single hidden variable having a plurality of values.

45. The method of claim 40, wherein utilizing a Bayesian network clustering approach
comprises employing a plurality of hidden variables, each having two values.

46. A computer-implemented method comprising: 

determining an allocation for each of a plurality of ads to at least one of a plurality of
clusters, given a constraint $\sum_{j} x_{ij} = q_i$, where $q_i$ comprises a quota for ad i, and
$x_{ij}$ comprises a total number of times ad i is shown in cluster j; and,

outputting the allocation of each ad to at least one of the plurality of clusters. 

47. The method of claim 46, wherein determining an allocation for each of a plurality of
ads to at least one of the plurality of clusters comprises maximizing an expression
$\sum_{i,j} p_{ij} x_{ij}$, where $p_{ij}$ comprises a probability that a user in cluster j will
actuate ad i, given the constraint.

48. The method of claim 46, wherein determining an allocation for each of a plurality of
ads to at least one of the plurality of clusters comprises determining the allocation for
each of the plurality of ads to at least one of the plurality of clusters further given a
constraint $\sum_{i} x_{ij} = c_j$, where $c_j$ comprises a constraint for cluster j, and $x_{ij}$
comprises a total number of times ad i is shown in cluster j.

49. The method of claim 46, further comprising:

selecting an ad for a current cluster from the allocation of each ad to the current
cluster; and,

displaying the ad. 
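
One way to read the selecting step of claim 49 is as a draw from the ads still allocated to the current cluster. The sketch below picks an ad with probability proportional to its remaining showings in that cluster and then decrements the count. The proportional rule, the `remaining` data structure, and the function name are assumptions for illustration; the claims do not prescribe a particular selection rule.

```python
import random


def select_ad(remaining, cluster, rng=random):
    """Pick an ad for `cluster`, weighted by its remaining allocated showings.

    `remaining` maps ad -> {cluster -> showings left}, i.e. the unused part
    of the allocation x_ij. Returns None when nothing is left for the cluster.
    """
    ads = [ad for ad, per_cluster in remaining.items()
           if per_cluster.get(cluster, 0) > 0]
    if not ads:
        return None
    weights = [remaining[ad][cluster] for ad in ads]
    chosen = rng.choices(ads, weights=weights, k=1)[0]
    remaining[chosen][cluster] -= 1  # one allocated showing has been used
    return chosen
```

Because each selection consumes one unit of the allocation, the number of times an ad is shown in a cluster never exceeds the value the allocation assigned to that pair.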

50. A computerized system comprising:

a database storing a plurality of ads, each ad having a quota;
an allocator to allocate each of the plurality of ads to at least one of a plurality of
clusters, based on a predetermined criterion accounting for at least the quota for each ad
and a constraint for each cluster; and,

a communicator to select an ad for a current cluster from ads allocated to the current 
cluster and output the ad to a user. 

51. The system of claim 50, wherein at least one of the allocator and the communicator
comprises a computer program executed from a computer-readable medium by a 
processor. 

52. The system of claim 50, wherein the database is stored as data on a
computer-readable medium.

53. A machine-readable medium having instructions stored thereon for execution by a
processor to perform a method comprising: 

allocating each of a plurality of ads to at least one of a plurality of clusters, based on a 
predetermined criterion accounting for at least a quota for each ad and a constraint for 
each cluster;

selecting an ad for a current cluster from ads allocated to the current cluster; and,
displaying the ad. 



54. The medium of claim 53, wherein the predetermined criterion comprises maximizing
an expression $\sum_{i,j} p_{ij} x_{ij}$, where $p_{ij}$ comprises a probability that a user in
cluster j will actuate ad i.

55. The medium of claim 54, wherein the predetermined criterion further comprises
maximizing the expression subject to a constraint $\sum_{j} x_{ij} = q_i$, where $q_i$
comprises a quota for ad i, and $x_{ij}$ comprises a total number of times ad i is shown
in cluster j.

56. The medium of claim 54, wherein the predetermined criterion further comprises
maximizing the expression subject to a constraint $\sum_{i} x_{ij} = c_j$, where $c_j$
comprises a constraint for cluster j, and $x_{ij}$ comprises a total number of times ad i is
shown in cluster j.

57. The medium of claim 53, wherein allocating each of a plurality of ads to at least one
of the plurality of clusters comprises determining for each ad in each cluster a probability
that a user in the cluster will actuate the ad.

58. The medium of claim 53, wherein the predetermined criterion comprises maximizing
an expected number of actuations of the plurality of ads, given the quota for each ad and
the constraint for each cluster.

59. A machine-readable medium having instructions stored thereon for execution by a
processor to perform a method comprising: 

determining an allocation for each of a plurality of ads to at least one of a plurality of
clusters, given a constraint $\sum_{j} x_{ij} = q_i$, where $q_i$ comprises a quota for ad i, and
$x_{ij}$ comprises a total number of times ad i is shown in cluster j; and,

outputting the allocation of each ad to at least one of the plurality of clusters. 

60. The medium of claim 59, wherein determining an allocation for each of a plurality of
ads to at least one of the plurality of clusters comprises maximizing an expression
$\sum_{i,j} p_{ij} x_{ij}$, where $p_{ij}$ comprises a probability that a user in cluster j will
actuate ad i, given the constraint.

61. The medium of claim 59, wherein determining an allocation for each of a plurality of
ads to at least one of the plurality of clusters comprises determining the allocation for
each of the plurality of ads to at least one of the plurality of clusters further given a
constraint $\sum_{i} x_{ij} = c_j$, where $c_j$ comprises a constraint for cluster j, and $x_{ij}$
comprises a total number of times ad i is shown in cluster j.

62. A computer-implemented method comprising: 

applying each of at least one first item to an ordered set of rules, each rule accounting 
for at least a quota for each of a plurality of second items, to determine a second item for 
each of the at least one first item; and,

effecting the second item for each of the at least one first item. 
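
The applying step of claim 62 can be pictured as a walk down the ordered rule list. In the sketch below, a first item is a set of binary features, each rule pairs a set of required features with an ad, and a rule fires only if its ad still has quota left. This representation and the quota bookkeeping are assumptions made for illustration, not details taken from the claims.

```python
def apply_rules(first_item, rules, remaining_quota, default=None):
    """Return the ad of the first matching rule whose ad still has quota.

    first_item      -- set of binary features describing the user and page
    rules           -- ordered list of (required_features, ad) pairs
    remaining_quota -- dict mapping ad -> showings still owed
    """
    for required_features, ad in rules:  # rules are tried in order
        if required_features <= first_item and remaining_quota.get(ad, 0) > 0:
            remaining_quota[ad] -= 1
            return ad
    return default  # no rule applied


# Hypothetical rule list: the last rule has no conditions and acts as a fallback.
rules = [(frozenset({"sports_page"}), "shoe_ad"),
         (frozenset(), "house_ad")]
```

Placing an unconditional rule last gives every first item some second item for as long as that ad has quota remaining.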



63. The method of claim 62, wherein each first item comprises at least information about 
a user, and a web page currently being browsed by the user. 

64. The method of claim 62, wherein the plurality of second items comprises a plurality
of ads, and effecting the second item comprises displaying the ad.

65. The method of claim 62, further initially comprising generating the ordered set of 
rules based on training data. 

66. The method of claim 65, wherein generating the ordered set of rules comprises: 

determining at least one significant correlation between a plurality of binary features
of the training data and a plurality of activations of second items of the training data;

determining a second item and at least one binary feature providing a largest
activation; and,

generating a rule based on the second item and the at least one binary feature
providing the largest activation.

67. The method of claim 66, wherein generating the ordered set of rules further
comprises:

removing records from the training data matching the rule generated; and,

repeating to generate another, lower-ordered rule while at least one significant
correlation still exists.

68. The method of claim 66, wherein determining at least one significant correlation
comprises utilizing one of: Chi-squared method, Fisher exact test method, and Bayesian
model selection method.
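
The rule-generation loop of claims 66 through 68 can be sketched with the Chi-squared option. The code below tests each (ad, feature) pair with a 2x2 Chi-squared statistic, keeps the significant pair with the most activations, emits a rule, removes the covered records, and repeats. The record layout, the 5% significance level, single-feature rules, and counting activations as the measure of "largest activation" are all assumptions for this illustration.

```python
CRITICAL_VALUE = 3.841  # Chi-squared critical value, 1 degree of freedom, 5% level


def chi_squared_2x2(a, b, c, d):
    """Pearson Chi-squared statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def generate_rules(records):
    """records: iterable of (features: frozenset, ad: str, activated: bool).

    Returns an ordered list of (required_features, ad) rules, highest first.
    """
    records = list(records)
    rules = []
    while True:
        best = None  # (activations, feature, ad)
        ads = sorted({ad for _, ad, _ in records})
        features = sorted({f for feats, _, _ in records for f in feats})
        for ad in ads:
            rows = [(feats, act) for feats, r_ad, act in records if r_ad == ad]
            for f in features:
                a = sum(1 for feats, act in rows if f in feats and act)
                b = sum(1 for feats, act in rows if f in feats and not act)
                c = sum(1 for feats, act in rows if f not in feats and act)
                d = sum(1 for feats, act in rows if f not in feats and not act)
                positive = a * d > b * c  # feature raises the activation rate
                if positive and chi_squared_2x2(a, b, c, d) > CRITICAL_VALUE:
                    if best is None or a > best[0]:
                        best = (a, f, ad)
        if best is None:
            return rules  # no significant correlation remains
        _, feature, ad = best
        rules.append((frozenset([feature]), ad))
        # Remove the records matched by the new rule before the next pass.
        records = [r for r in records if feature not in r[0]]
```

Each pass removes at least one record, so the loop terminates, and rules found earlier sit higher in the returned order than rules found later.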

69. A computer-implemented method comprising: 

determining at least one significant correlation between a plurality of binary features
of the training data and a plurality of activations of items from training data;

determining an ad and at least one binary feature providing a largest activation, each
rule accounting for at least a quota for the item;

generating a rule based on the ad and the at least one binary feature providing the
largest activation;

removing records from the training data matching the rule generated; and,

repeating to generate another, lower-ordered rule while at least one significant
correlation still exists.

70. The method of claim 69, wherein each item comprises an ad. 

71. A machine-readable medium having instructions stored thereon for execution by a 
processor to perform a method comprising: 

applying each of at least one first item to an ordered set of rules, each rule accounting
for at least a quota for each of a plurality of second items, to determine a second item for
each of the at least one first item; and,

effecting the second item for each of the at least one first item. 



72. The medium of claim 71, the method further initially comprising generating the 
ordered set of rules based on training data. 

73. The medium of claim 71, wherein each first item comprises at least information about
a user, and a web page currently being browsed by the user, and each second item 
comprises an ad. 

74. The medium of claim 71, wherein generating the ordered set of rules comprises:

determining at least one significant correlation between a plurality of binary features
of the training data and a plurality of activations of second items of the training data;

determining a second item and at least one binary feature providing a largest
activation;

generating a rule based on the second item and the at least one binary feature
providing the largest activation;

removing records from the training data matching the rule generated; and,

repeating to generate another, lower-ordered rule while at least one significant
correlation still exists.

75. A machine-readable medium having instructions stored thereon for execution by a 
processor to perform a method comprising: 

determining at least one significant correlation between a plurality of binary features
of the training data and a plurality of activations of items from training data;

determining an ad and at least one binary feature providing a largest activation, each 
rule accounting for at least a quota for the item; 

generating a rule based on the ad and the at least one binary feature providing the 
largest activation;

removing records from the training data matching the rule generated; and, 

repeating to generate another, lower-ordered rule while at least one significant 
correlation still exists. 



76. The medium of claim 75, wherein each item comprises an ad. 
